TITLE OF THE INVENTION

HCV RNA-DEPENDENT RNA POLYMERASE 

BACKGROUND OF THE INVENTION 
The references cited in the present application are not admitted to be prior art to
the claimed invention.

It is estimated that about 3% of the world's population is infected with the
Hepatitis C virus (HCV). (Wasley et al., 2000. Semin. Liver Dis. 20, 1-16.) Exposure to HCV
results in an overt acute disease in a small percentage of cases, while in most instances the virus
establishes a chronic infection that causes liver inflammation and slowly progresses to liver
failure and cirrhosis. (Iwarson, 1994. FEMS Microbiol. Rev. 14, 201-204.) Epidemiological
surveys indicate HCV plays an important role in hepatocellular carcinoma pathogenesis. (Kew,
1994. FEMS Microbiol. Rev. 14, 211-220; Alter, 1995. Blood 85, 1681-1695.)

The HCV genome consists of a single-stranded RNA about 9.5 kb in length,
encoding a precursor polyprotein of about 3000 amino acids. (Choo et al., 1989. Science 244,
362-364; Choo et al., 1989. Science 244, 359-362; Takamizawa et al., 1991. J. Virol. 65,
1105-1113.) The HCV polyprotein contains the viral proteins in the order:
C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B.

Individual viral proteins are produced by proteolysis of the HCV polyprotein.
Host cell proteases release the putative structural proteins C, E1, E2, and p7, and create the
N-terminus of NS2 at amino acid 810. (Mizushima et al., 1994. J. Virol. 68, 2731-2734; Hijikata
et al., 1993. Proc. Natl. Acad. Sci. USA 90, 10773-10777.)

The non-structural proteins NS3, NS4A, NS4B, NS5A and NS5B presumably
form the virus replication machinery and are released from the polyprotein. A zinc-dependent
protease associated with NS2 and the N-terminus of NS3 is responsible for cleavage between
NS2 and NS3. (Grakoui et al., 1993. J. Virol. 67, 1385-1395; Hijikata et al., 1993. Proc. Natl.
Acad. Sci. USA 90, 10773-10777.)

A distinct serine protease located in the N-terminal domain of NS3 is responsible
for proteolytic cleavages at the NS3/NS4A, NS4A/NS4B, NS4B/NS5A and NS5A/NS5B
junctions. (Bartenschlager et al., 1993. J. Virol. 67, 3835-3844; Grakoui et al., 1993. Proc.
Natl. Acad. Sci. USA 90, 10583-10587; Tomei et al., 1993. J. Virol. 67, 4017-4026.)
RNA-stimulated NTPase and helicase activities are located in the C-terminal domain of NS3.

NS4A provides a cofactor for NS3 protease activity. (Failla et al., J. Virol. 1994.
68, 3753-3760; De Francesco et al., U.S. Patent No. 5,739,002.)

NS5A is a highly phosphorylated protein conferring interferon resistance. (De
Francesco et al., U.S. Patent No. 6,383,768; Pawlotsky, 1999. J. Viral Hepat. Suppl. 1, 47-48.)

NS5B provides an RNA-dependent RNA polymerase. (De Francesco et al.,
International Publication Number WO 96/37619, published November 28, 1996; Behrens et al.,
1996. EMBO J. 15, 12-22; Lohmann et al., 1998. Virology 249, 108-118.) Soluble
RNA-dependent RNA polymerase can be produced by a 21 amino acid truncation at the C-terminus.
(Yamashita et al., The Journal of Biological Chemistry 273:15479-15486, 1998; Ferrari et al.,
Journal of Virology 73:1649-1654, 1999.)

Different genotypes and quasispecies of HCV have been identified. (Farci et al.,
Seminars in Liver Disease 20:103-126, 2000; Okamoto et al., Virology 188:331-341, 1992.)

SUMMARY OF THE INVENTION 

The present invention features NS5B polypeptides from different clinically
important HCV genotypes. The polypeptides can be used individually, or as part of a panel of
RNA-dependent RNA polymerases, to evaluate the effectiveness of a compound to inhibit NS5B
activity.

Thus, a first aspect of the present invention describes a purified polypeptide
comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 1, SEQ
ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4 and SEQ ID NO: 5. A "purified polypeptide" is
present in an environment lacking one or more other polypeptides with which it is naturally
associated and/or represents at least about 10% of the total protein present.

In different embodiments, the purified polypeptide represents at least about 50%,
at least about 75%, or at least about 95% of the total protein in a sample or preparation.
Reference to "purified polypeptide" does not require that the polypeptide has undergone any
purification and may include, for example, chemically synthesized polypeptide that has not been
purified.

Another aspect of the present invention describes a recombinant nucleic acid
comprising a nucleotide sequence encoding an amino acid sequence selected from the group
consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4 and SEQ ID NO: 5.
A recombinant nucleic acid is a nucleic acid that by virtue of its sequence and/or form does not
occur in nature. The form of the nucleic acid is provided by its association with other nucleic
acids found in nature, such as the absence of one or more other nucleic acid regions naturally
associated with a particular nucleic acid (e.g., upstream or downstream regions) and/or purified
nucleic acid.

Another aspect of the present invention describes a method of evaluating the
ability of a compound to inhibit HCV RNA-dependent RNA polymerase. The method involves
measuring the ability of the compound to inhibit the activity of one or more HCV
RNA-dependent RNA polymerases having an amino acid sequence selected from the group consisting
of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4 and SEQ ID NO: 5.

Unless particular terms are mutually exclusive, reference to "or" indicates either 
or both possibilities. Occasionally phrases such as "and/or" are used to highlight either or both 
possibilities. 

Reference to "comprises" is open-ended allowing for additional elements or
steps. Occasionally phrases such as "one or more" are used with or without "comprises" to
highlight the possibility of additional elements or steps.

Unless explicitly stated, reference to terms such as "a" or "an" is not limited to
one. For example, "a cell" does not exclude "cells". Occasionally phrases such as "one or more"
are used to highlight the presence of a plurality.

Other features and advantages of the present invention are apparent from the
additional descriptions provided herein including the different examples. The provided
examples illustrate different components and methodology useful in practicing the present
invention. The examples do not limit the claimed invention. Based on the present disclosure the
skilled artisan can identify and employ other components and methodology useful for practicing
the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A-1E provide the amino acid sequence for different HCV NS5B
sequences. Figure 1A illustrates SEQ ID NO: 1, Figure 1B illustrates SEQ ID NO: 2, Figure 1C
illustrates SEQ ID NO: 3, Figure 1D illustrates SEQ ID NO: 4, and Figure 1E illustrates SEQ ID
NO: 5.

Figures 2A-2E provide nucleotide sequences encoding SEQ ID NOs: 1-5. Figure
2A (SEQ ID NO: 6) illustrates the nucleotide sequence encoding SEQ ID NO: 1. Figure 2B
(SEQ ID NO: 7) illustrates the nucleotide sequence encoding SEQ ID NO: 2. Figure 2C (SEQ ID
NO: 8) illustrates the nucleotide sequence encoding SEQ ID NO: 3. Figure 2D (SEQ ID NO: 9)
illustrates the nucleotide sequence encoding SEQ ID NO: 4. Figure 2E (SEQ ID NO: 10)
illustrates the nucleotide sequence encoding SEQ ID NO: 5.

DETAILED DESCRIPTION OF THE INVENTION

SEQ ID NOs: 1-5 provide NS5B sequences from different HCV genotypes. SEQ
ID NO: 1 is from HCV genotype 2a. SEQ ID NO: 2 is from HCV genotype 2b. SEQ ID NO: 3
is from genotype 3a. SEQ ID NO: 4 is from genotype 4a. SEQ ID NO: 5 is from genotype 6a.
SEQ ID NOs: 1-5 are all modified NS5B sequences containing an amino terminus methionine
and a carboxyl terminus 21 amino acid deletion.
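
The two modifications named above (an added amino-terminal methionine and a 21 amino acid carboxyl-terminal deletion) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name make_delta21 is hypothetical, the input is a plain one-letter amino acid string, and the toy sequence below is not a real NS5B sequence. It does not reproduce the cloning strategy of the Examples.

```python
def make_delta21(full_length: str, deletion: int = 21) -> str:
    """Return a C-terminally truncated sequence that starts with Met.

    Hypothetical helper for illustration: drops the last `deletion`
    residues and prepends 'M' if the sequence does not already start
    with methionine.
    """
    if len(full_length) <= deletion:
        raise ValueError("sequence is shorter than the requested deletion")

    truncated = full_length[:-deletion]   # remove the C-terminal residues
    if not truncated.startswith("M"):
        truncated = "M" + truncated       # add the amino-terminal methionine
    return truncated


# Toy 591-residue input (not a real sequence): 570 retained residues
# followed by a 21-residue hydrophobic-like tail.
toy = "S" + "A" * 569 + "L" * 21
construct = make_delta21(toy)
print(len(construct), construct[:2])
```

With the toy input, the result keeps 570 residues of the input plus the added methionine.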

SEQ ID NOs: 1-5 provide polypeptides having RNA-dependent RNA polymerase
activity. The polypeptides have different uses, such as providing RNA-dependent RNA
polymerase activity based on different sequences and being used to evaluate the ability of a
compound to inhibit HCV RNA-dependent RNA polymerase activity.

The polypeptides can be used individually, or as part of a panel of
RNA-dependent RNA polymerases, to evaluate the effectiveness of a compound to inhibit HCV
RNA-dependent RNA polymerase activity. Compounds affecting HCV NS5B activity have research
and therapeutic applications. Research applications include using the compounds as a tool to
study RNA-dependent RNA polymerase activity. Therapeutic applications include using those
compounds having appropriate pharmacological properties such as efficacy and lack of
unacceptable toxicity to treat or inhibit onset of HCV in a patient.

NS5B Sequences 

NS5B sequences described herein include polypeptides containing a region
structurally related to SEQ ID NOs: 1, 2, 3, 4 or 5. A polypeptide region "structurally related"
to a reference polypeptide has an amino acid identity of at least 90% to the reference
polypeptide. Polypeptides containing a region structurally related to SEQ ID NOs: 1, 2, 3, 4 or
5 can also contain additional polypeptide regions that may or may not be related to NS5B.

Percent identity to a reference sequence is determined by aligning the polypeptide
sequence with the reference sequence and determining the number of identical amino acids in
the corresponding regions. This number is divided by the total number of amino acids in the
reference sequence (e.g., SEQ ID NO: 1) and then multiplied by 100 and rounded to the nearest
whole number.
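
The identity calculation described above can be sketched in Python. This is a minimal illustration rather than a prescribed implementation: it assumes the two sequences have already been aligned to equal length by an external tool, that "-" marks gap positions, and that exact halves round up. The function name percent_identity is hypothetical.

```python
import math


def percent_identity(aligned_query: str, aligned_reference: str) -> int:
    """Percent identity of a query to a reference sequence.

    Assumes both strings come from a prior pairwise alignment, have
    equal length, and use '-' for gaps (assumptions of this sketch).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")

    # Count aligned positions where both sequences carry the same residue.
    identical = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and r != "-"
    )

    # Denominator: total number of amino acids in the reference sequence.
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference sequence is empty")

    # Multiply by 100 and round to the nearest whole number (half up).
    return math.floor(100 * identical / reference_length + 0.5)


# Toy example: one mismatch over a 10-residue reference.
print(percent_identity("MSMSYTWSGA", "MSMSYTWTGA"))
```

Because the denominator is the reference length, residues inserted in the query relative to the reference do not lower the value in this sketch, whereas deletions and substitutions do.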

Using SEQ ID NOs: 1, 2, 3, 4 or 5 as a frame of reference, alterations to the
sequence can be made taking into account the known properties of amino acids. Alterations
include one or more amino acid additions, deletions, and/or substitutions. The overall effect of
different alterations can be evaluated using techniques described herein to confirm the ability of
a particular polypeptide to provide RNA-dependent RNA polymerase activity.

Generally, in substituting different amino acids to retain activity it is preferable to
exchange amino acids having similar properties. Factors that can be taken into account for an
amino acid substitution include amino acid size, charge, polarity, and hydrophobicity. The
effects of different amino acid R-groups on amino acid properties are well known in the art. (See,
for example, Ausubel, Current Protocols in Molecular Biology, John Wiley, 1987-2002,
Appendix 1C.)

In exchanging amino acids to maintain activity, the replacement amino acid
should have one or more similar properties such as approximately the same charge and/or size
and/or polarity and/or hydrophobicity. For example, substituting valine for leucine, arginine for
lysine, and asparagine for glutamine are good candidates for not causing a change in polypeptide
functioning.
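
The idea of exchanging amino acids with similar properties can be sketched as a simple lookup in Python. The grouping below is a coarse textbook-style simplification chosen for this illustration, not a classification taken from the present disclosure, and the names PROPERTY_CLASSES and same_property_class are hypothetical. Sharing a class only flags a candidate; activity would still need to be confirmed experimentally.

```python
# Coarse property classes for the 20 standard amino acids (one-letter codes).
# This grouping is an assumption of the sketch; other schemes exist.
PROPERTY_CLASSES = {
    "aliphatic": set("AVLIM"),
    "aromatic": set("FWY"),
    "basic": set("KRH"),
    "acidic": set("DE"),
    "polar_uncharged": set("NQST"),
    "special": set("CGP"),
}


def same_property_class(original: str, replacement: str) -> bool:
    """Return True if both residues fall into the same coarse class."""
    return any(
        original in members and replacement in members
        for members in PROPERTY_CLASSES.values()
    )


# The three substitutions named in the text, plus one charge reversal.
for original, replacement in [("L", "V"), ("K", "R"), ("Q", "N"), ("D", "K")]:
    print(original, "->", replacement, same_property_class(original, replacement))
```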

Alterations to achieve a particular purpose include those designed to facilitate
production or efficacy of the polypeptide; or cloning of the encoded nucleic acid. Polypeptide
production can be facilitated through the use of an initiation codon (e.g., coding for methionine)
suitable for recombinant expression. Cloning can be facilitated by, for example, the introduction
of restriction sites which can be accompanied by amino acid additions or changes.

Additional regions can be added to, for example, facilitate polypeptide
purification or identification. Examples of groups that can be used to facilitate purification or
identification include polypeptides providing tags such as a six-histidine tag, trpE, glutathione
and maltose-binding protein.

In different embodiments, the SEQ ID NOs: 1, 2, 3, 4 or 5 polypeptide comprises,
consists essentially of, or consists of, a sequence at least 90%, at least 95%, or at least 99%
identical to SEQ ID NOs: 1, 2, 3, 4 or 5; or differing from SEQ ID NOs: 1, 2, 3, 4 or 5 by 0, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 amino acid alterations.


Polypeptide Production and Purification 

Polypeptides can be produced using standard techniques including those
involving chemical synthesis and those involving purification from a cell producing the
polypeptide. Techniques for chemical synthesis of polypeptides are well known in the art. (See,
e.g., Vincent, Peptide and Protein Drug Delivery, New York, N.Y., Decker, 1990.)

Obtaining polypeptides from a cell is facilitated using recombinant nucleic acid
techniques to produce the polypeptide. Recombinant nucleic acid techniques for producing a
polypeptide involve introducing, or producing, a recombinant gene encoding the polypeptide in a
cell and expressing the polypeptide.

A recombinant gene contains nucleic acid encoding a polypeptide along with
regulatory elements for polypeptide expression. The recombinant gene can be present in a
cellular genome or can be part of an expression vector.

The regulatory elements that may be present as part of a recombinant gene
include those naturally associated with the polypeptide encoding sequence and exogenous
regulatory elements not naturally associated with the polypeptide encoding sequence.
Exogenous regulatory elements such as an exogenous promoter can be useful for expressing a
recombinant gene in a particular host or increasing the level of expression. Generally, the
regulatory elements that are present in a recombinant gene include a transcriptional promoter, a
ribosome binding site, a terminator, and an optionally present operator.

Expression of a recombinant gene in a cell is facilitated through the use of an
expression vector. Preferably, an expression vector in addition to a recombinant gene also
contains an origin of replication for autonomous replication in a host cell, a selectable marker, a
limited number of useful restriction enzyme sites, and a potential for high copy number.
Examples of expression vectors are cloning vectors, modified cloning vectors, specifically
designed plasmids and viruses.

Due to the degeneracy of the genetic code, a large number of different encoding
nucleic acid sequences can be used to code for a particular polypeptide. The degeneracy of the
genetic code arises because almost all amino acids are encoded by different combinations of
nucleotide triplets or "codons". Amino acids are encoded by codons as follows:

A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU
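
To make the scale of this degeneracy concrete, the following Python sketch counts how many distinct RNA sequences could encode a given peptide, using the codon list above. The names CODONS and count_encodings are hypothetical, and the eight-residue peptide used as input is simply the Leu-Glu-His6 tag that appears in Example 2, chosen as a convenient illustration.

```python
# Codon table transcribed from the list above (RNA codons, one-letter codes).
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encodings(peptide: str) -> int:
    """Number of distinct codon sequences that encode `peptide`."""
    total = 1
    for residue in peptide:
        total *= len(CODONS[residue])  # independent choice at each position
    return total


# Leu-Glu followed by six histidines: 6 * 2 * 2**6 possible encodings.
print(count_encodings("LEHHHHHH"))
```

Even this short peptide has several hundred possible encodings, and the count grows multiplicatively with length.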

Techniques for recombinant gene production, introduction into a cell, and
recombinant gene expression are well known in the art. Examples of such general techniques
are provided in references such as Ausubel, Current Protocols in Molecular Biology, John
Wiley, 1987-2002, and Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition,
Cold Spring Harbor Laboratory Press, 1989.

Methods applying recombinant gene production to HCV RNA-dependent RNA
polymerase expression are described in the scientific literature and the Examples provided
below. The purification of the full-length enzyme from insect cells transfected with a
baculoviral vector has been described. (Lohmann et al., J. Virol. 71:8416-8428, 1997; De
Francesco et al., Meth. Enzymol. 275:58-67, 1996.) The full-length enzyme has also been
purified from E. coli. (Oh et al., J. Virol. 73:7694-7702, 1999.)

The C-terminal region of the HCV RNA polymerase contains a stretch of highly
hydrophobic amino acids that decrease the solubility of the enzyme in the absence of detergent
and likely serve as a membrane anchor in vivo. Forms of the HCV RNA polymerase with the
C-terminus truncated to remove these hydrophobic amino acids have been expressed in and
purified from E. coli using conventional column chromatography. (Yamashita et al., J. Biol.
Chem. 273:15479-15486, 1998; Ferrari et al., J. Virol. 73:1649-1654, 1999; Carroll et al.,
Biochemistry 39:8243-8249, 2000; Luo et al., J. Virol. 74:851-63, 2000; Leveque et al., J. Virol.
77:9020-9028, 2003.)


NS5B Assays 

Techniques for measuring HCV RNA-dependent RNA polymerase activity are
well known in the art. Examples of techniques for measuring HCV RNA-dependent RNA
polymerase activity are provided in the references cited in the prior section concerning HCV
expression and purification.

Examples 

Examples are provided below further illustrating different features of the present
invention. The examples also illustrate useful methodology for practicing the invention. These
examples do not limit the claimed invention.

Example 1: Rescue and Characterization of NS5B
NS5B genes were rescued and characterized from the sera of chronically infected
chimpanzees. Total RNA was isolated from serum samples of chimpanzees chronically infected
with HCV using the QIAGEN RNeasy Mini Kit according to the manufacturer's instructions
(QIAGEN, Inc., Valencia, CA). Total RNA (5 to 10 microliters) was used as a template for the
reverse transcriptase reaction (Superscript II RT, Invitrogen Life Technologies, Carlsbad, CA)
with a 34 nucleotide dATP primer. RT reactions were heat inactivated at 65°C for 15 minutes,
and then digested with 1 µL each of RNase H and RNase T1 (Roche Applied Science, Indianapolis,
IN) at 37°C for 20 minutes to remove RNA prior to PCR. Nested PCR was performed using the
Expand High Fidelity PCR System (Roche Applied Science, Indianapolis, IN) and the following
primers:

Genotype 2a
PCR1, forward 5'-CTCCGTCGTGTGCTGCGCCATGTC
      reverse 34 nucleotide dATP
PCR2, forward 5'-TCATACTCTTGGACCGGGGCTCT
      reverse 5'-GTGCCGCTCTATCGAGCGGGGAGT

Genotype 2b
PCR1, forward 5'-ATACTCCTGGACAGGGGCCCT
      reverse 34 nucleotide dATP
PCR2, forward 5'-ATACTCCTGGACAGGGGCCCT
      reverse 5'-CCGCTCTACCGAGCGGGGAGT

Genotype 3a
PCR1, forward 5'-GAGCGTGGTCTGCTGCTCTATGTC
      reverse 34 nucleotide dATP
PCR2, forward 5'-ATAATATGATCACACCATGTAGTGCTGAGG
      reverse 5'-CCAGCTCACCGTGCTGGCAGG

Genotype 4a
PCR1, forward 5'-GATCGGAGGACGTCGTGTGCTGTT
      reverse 34 nucleotide dATP
PCR2, forward 5'-GTTCGATGTCATACTCGTGGACTG
      reverse 5'-AAGCTGCCTACCGAGCAGGCAGCA

Genotype 6a
PCR1, forward 5'-CTAAGCTCAGGCTCTTGGTCCACT
      reverse 34 nucleotide dATP
PCR2, forward 5'-GACGACGTCGTATGTTGTTCCATG
      reverse 5'-CTACCGAGCGGGGAGCAAAAAGATG

PCR products were cloned into pGEM-T and individual clones sequenced. Genotype was 
confirmed based upon closest homology to prototype sequences listed in GenBank. 
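
When working with primer lists such as the one in this Example, basic properties like length, GC content and reverse complement are often checked before use. The Python sketch below shows one way to do that for the genotype 2a PCR1 forward primer, written here without the extraction spaces. The helper names gc_fraction and reverse_complement are hypothetical, and no such check is described in the Example itself.

```python
# Genotype 2a PCR1 forward primer from the list in this Example (5' to 3').
PRIMER_2A_PCR1_FWD = "CTCCGTCGTGTGCTGCGCCATGTC"

# Watson-Crick complements for unambiguous DNA bases only.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in an upper-case DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    return sum(1 for base in dna if base in "GC") / len(dna)


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


print(len(PRIMER_2A_PCR1_FWD))
print(round(gc_fraction(PRIMER_2A_PCR1_FWD), 3))
print(reverse_complement(PRIMER_2A_PCR1_FWD))
```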


Example 2: Construction of NS5B Expression Clones 
The BK NS5B Δ21 gene (Carroll et al., J. Biol. Chem. 278:11979-11984, 2003)
was modified by standard molecular biology techniques to encode the sequence
Leu-Glu-His-His-His-His-His-His (CTCGAGCACCACCACCACCACCAC) at the C-terminal end of the
NS5B Δ21 coding sequence after codon 570, followed by a stop codon. The Leu-Glu
pair is encoded by a unique XhoI site that is just in front of the histidine tag. The vector was
further modified to encode a unique BclI site at NS5B codon 10. This vector served as a
template to subclone additional NS5B genes for protein expression as BclI-XhoI fragments.
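
The relationship between the tag DNA and the encoded residues can be checked with a few lines of Python. This is a minimal sketch: the codon map covers only the three codons present in the tag, the XhoI recognition sequence CTCGAG is standard restriction enzyme knowledge rather than something derived here, and the names TAG_DNA and translate_tag are hypothetical.

```python
# Tag sequence quoted in the text: CTCGAG (XhoI site) followed by six CAC codons.
TAG_DNA = "CTCGAG" + "CAC" * 6
XHOI_SITE = "CTCGAG"

# Only the codons that occur in the tag; not a full genetic code table.
TAG_CODONS = {"CTC": "Leu", "GAG": "Glu", "CAC": "His"}


def translate_tag(dna: str) -> str:
    """Translate the tag DNA into hyphen-separated three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "-".join(TAG_CODONS[codon] for codon in codons)


print(translate_tag(TAG_DNA))
print(TAG_DNA.startswith(XHOI_SITE))
```

The first two codons, CTC and GAG, together spell the XhoI site, which is why the Leu-Glu pair sits directly in front of the histidine tag.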

SEQ ID NOs: 1-5 all initiate with the first 10 codons of genotype 1b BK
sequences. NS5B genes were cloned in frame as BclI-XhoI fragments using clone specific PCR
primers. The NS5B constructs lacked the C-terminal 21 residues, a truncation previously
demonstrated to increase solubility. All constructs were verified by DNA sequencing.

Example 3: Bacterial Expression of NS5B Δ21 Enzymes
Glycerol stocks were used as seed cultures for large-scale purification. Glycerol
stocks were prepared by transforming DNA into Rosetta™ (DE3) competent cells (Novagen). A
20 mL overnight culture of Luria-Bertani (LB) broth (containing 50 µg/mL ampicillin and 34
µg/mL chloramphenicol) was inoculated from a single colony. Cells were collected by
centrifugation and used to inoculate a 1 L culture of LB broth with 100 µg/mL ampicillin only,
and grown to mid-log phase (A600 of 0.4-0.5). To generate glycerol stocks, cells were again
collected by centrifugation and resuspended, per liter of culture, in 50 mL ice-cold LB broth.
Then 500 µL aliquots of cells were individually mixed with 500 µL of 50% glycerol, placed into
storage vials, quick frozen on dry ice and kept at -70°C until use.

For large-scale growth, a glycerol stock was plated on LB plates containing 50
µg/mL ampicillin and 34 µg/mL chloramphenicol (Teknova), incubated overnight at 37°C,
collected by scraping, and used as an inoculum for a 200 mL starter culture. After ~15
minutes of shaking at 225 rpm at 37°C, 20 mL of the starter culture was used to seed 980 mL of
LB broth containing 100 µg/mL ampicillin. The cultures were grown to an optical density
(A600 nm) of ~0.7, and induced with 1 mM isopropylthio-β-galactoside (IPTG, from Invitrogen
Life Technologies Inc.). The temperature and shaking were then lowered to 18°C and 210 rpm
for the 18 hour induction period. Cells were collected by centrifugation and stored at -70°C until
use.

Example 4: Purification of NS5B Δ21

All steps in the purification were performed on ice or in a refrigerated 4°C cold
room, and with pre-chilled buffers. Cell pellets were resuspended with 200 mL of lysis buffer
(20 mM Tris-HCl pH 7.5, 10% glycerol, 0.5 M KCl, 5 mM MgCl2, 2 mM β-mercaptoethanol
(β-ME), 0.2% n-octylglucoside, Complete EDTA-Free Protease Inhibitors from Roche Diagnostics
Corp.). To this was added 5,000 U DNase I (grade I, Roche), and the mixture was incubated with
stirring for 10 minutes. This mixture was dounce homogenized until the lysate was homogeneous,
then fluidized with three passes through the Microfluidizer (model 110Y, Microfluidics
Corporation). The fluidized lysate was centrifuged at 15,000 rpm for 30 minutes in a JA-17 rotor
(Beckman Coulter).

The supernatant was collected, mixed with 5 mL of packed TALON® CellThru
resin (Cobalt affinity resin, Clontech), and incubated for 1 hour with gentle agitation to allow
sample binding. The mixture was centrifuged at 1750 rpm in the GH-3.8 rotor (Beckman
Coulter) for 5 minutes to pellet the resin. The protein-bound resin was washed with 5 column
volumes of Wash-EQ buffer (20 mM Tris-HCl pH 7.5, 10% glycerol, 0.5 M KCl, 2 mM β-ME,
0.2% n-octylglucoside) for 5 minutes, the resin pelleted by centrifugation at 1750 rpm in the
GH-3.8 rotor for 2 minutes, and the supernatant removed. This wash procedure was repeated an
additional four times. The resin was then washed a final time with 5 column volumes of Wash
buffer (20 mM Tris-HCl pH 7.5, 10% glycerol, 0.5 M KCl, 2 mM β-ME, 0.2% n-octylglucoside,
10 mM imidazole).

To elute protein, the resin was resuspended with 1 column volume of elution
buffer (20 mM Tris-HCl pH 7.5, 10% glycerol, 0.5 M KCl, 2 mM β-ME, 0.2% n-octylglucoside,
200 mM imidazole) and incubated with gentle agitation for 10 minutes. The resin was pelleted
by centrifugation at 1750 rpm in the GH-3.8 rotor for 2 minutes, the eluate collected, and EDTA
added to a final concentration of 1 mM. The elution procedure was repeated twice more, but the
eluates were kept separate. The eluates were then dialyzed in dialysis buffer (20 mM Tris-HCl
pH 7.5, 10% glycerol, 0.5 M KCl, 3 mM dithiothreitol (DTT), 0.2% n-octylglucoside) with a
change of buffer. Concentrated eluate fractions (> 50% of the most concentrated fraction) were
combined, aliquoted, quick frozen on dry ice, and stored at -70°C until use.

Protein quantitation was performed using Pierce's Coomassie Plus Protein
reagent and a Molecular Devices SpectraMax 250 with the SOFTmaxPRO v3.1.1 software.
Protein visualization was performed using 4-15% gradient Tris-HCl SDS PAGE gels (Bio-Rad)
and Bio-Safe Coomassie (Bio-Rad). Protein purity was determined by quantitation using the
Storm860 and ImageQuant software (Molecular Dynamics).

Example 5: Polymerase Assay 

The genotype 2a (SEQ ID NO: 1), 2b (SEQ ID NO: 2), and 3a (SEQ ID NO: 3)
polymerases were titrated in activity-linearity assays in a final concentration range from 62.5
nM to 1500 nM (1250 nM for the SEQ ID NO: 3 enzyme). Polymerase was pre-incubated for 1
hour at room temperature with 0.75 µg per reaction of t500 RNA template (IBA GmbH) in a
volume of 45 µL. The t500 RNA template comprises bases 3504-4004 of the HCV BK genome
and corresponds to the NS2/3 region as previously described (Carroll et al., Biochemistry
39:8243-8249, 2000). The final buffer conditions were: 20 mM Tris-HCl pH 7.5; 50
µM EDTA; 5 mM DTT; 2 mM MgCl2; 80 mM KCl; 0.4 U/µL rRNasin (Promega).

The reaction was initiated by the addition of 5 µL of a nucleotide triphosphate
cocktail which consisted of 10 µM each ATP, CTP, UTP, and GTP (Ultrapure NTP set from
Amersham Biosciences) which had been spiked with 0.2 µL of α-33P GTP (10 mCi/mL, Perkin
Elmer Life Sciences). Assay conditions for the genotype 4a (SEQ ID NO: 4) and 6a (SEQ ID NO:
5) enzymes were identical to those described for SEQ ID NOs: 1-3 except that the nucleotide
concentrations were 100 µM each. The final enzyme reaction volume was 50 µL. To quench the
reaction, 20 µL of 0.5 M EDTA was added. For quantitation, 50 µL of the quenched reaction
was blotted onto DE81 Whatman filter disks, dried, washed ten times with 200 mL of 0.3 M
ammonium formate pH 8.0, ethanol rinsed, dried, imaged with Storm860/ImageQuant, and
quantitated by liquid scintillation counting. The results are shown in Tables 1 and 2. By way of
comparison, a Δ21 histidine-tagged HCV BK NS5B purified and assayed under similar
conditions had a specific activity of 74 nmol/(hr*mg).

Table 1

SEQ ID NO:    Specific Activity [nmol/(hr*mg)]
1             2
2             15
3             147


Table 2

SEQ ID NO:    Specific Activity [nmol/(hr*mg)]
4             2
5             2
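
For a quick side-by-side reading of Tables 1 and 2, each specific activity can be expressed relative to the 74 nmol/(hr*mg) reported above for the histidine-tagged BK enzyme. The Python sketch below does only that arithmetic. The names are hypothetical, and the comparison is rough: the SEQ ID NO: 4 and 5 enzymes were assayed at a different nucleotide concentration, and the BK value was obtained under "similar" rather than identical conditions.

```python
# Specific activities in nmol/(hr*mg), keyed by SEQ ID NO (Tables 1 and 2).
SPECIFIC_ACTIVITY = {1: 2, 2: 15, 3: 147, 4: 2, 5: 2}

# Comparison value quoted in the text for the BK enzyme.
BK_REFERENCE = 74


def relative_activity(activities: dict, reference: float) -> dict:
    """Return each activity as a fraction of the reference value."""
    return {seq_id: value / reference for seq_id, value in activities.items()}


relative = relative_activity(SPECIFIC_ACTIVITY, BK_REFERENCE)
for seq_id, ratio in sorted(relative.items()):
    print(f"SEQ ID NO: {seq_id}: {ratio:.3f} x BK")
```

On these numbers only the SEQ ID NO: 3 enzyme exceeds the BK comparison value; the others fall well below it.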



Other embodiments are within the following claims. While several embodiments
have been shown and described, various modifications may be made without departing from the
spirit and scope of the present invention.

WHAT IS CLAIMED IS:



1. A purified polypeptide comprising an amino acid sequence selected from
the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, and
SEQ ID NO: 5.



2. The polypeptide of claim 1, wherein said polypeptide consists of the 
amino acid sequence of SEQ ID NO: 1. 

3. The polypeptide of claim 1, wherein said polypeptide consists of the
amino acid sequence of SEQ ID NO: 2.

4. The polypeptide of claim 1, wherein said polypeptide consists of the 
amino acid sequence of SEQ ID NO: 3. 


5. The polypeptide of claim 1, wherein said polypeptide consists of the 
amino acid sequence of SEQ ID NO: 4. 



6. The polypeptide of claim 1, wherein said polypeptide consists of the
amino acid sequence of SEQ ID NO: 5.



7. A recombinant nucleic acid comprising a nucleotide sequence encoding an 
amino acid sequence selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ 
ID NO: 3, SEQ ID NO: 4, and SEQ ID NO: 5. 


8. The nucleic acid of claim 7, wherein said nucleotide sequence encodes the
amino acid sequence of SEQ ID NO: 1.



9. The nucleic acid of claim 7, wherein said nucleotide sequence encodes the
amino acid sequence of SEQ ID NO: 2.

10. The nucleic acid of claim 7, wherein said nucleotide sequence encodes the 
amino acid sequence of SEQ ID NO: 3. 






11. The nucleic acid of claim 7, wherein said nucleotide sequence encodes the 
amino acid sequence of SEQ ID NO: 4. 

12. The nucleic acid of claim 7, wherein said nucleotide sequence encodes the 
amino acid sequence of SEQ ID NO: 5. 

13. The nucleic acid of claim 7, wherein said nucleic acid is an expression
vector.

14. The nucleic acid of claim 13, wherein said nucleotide sequence is selected 
from the group consisting of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, 
SEQ ID NO: 9, and SEQ ID NO: 10. 

15. A method of evaluating the ability of a compound to inhibit HCV RNA- 
dependent RNA polymerase comprising the step of measuring the ability of said compound to 
inhibit the activity of one or more HCV RNA-dependent RNA polymerases selected from the 
group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, and SEQ ID 
NO: 5. 

16. The method of claim 15, wherein said method comprises the use of two or 
more of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, and SEQ ID NO: 5. 

17. The method of claim 16, wherein said method comprises the use of SEQ
ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3.

ABSTRACT OF THE DISCLOSURE

The present invention features NS5B polypeptides from different clinically
important HCV genotypes. The polypeptides can be used individually, or as part of a panel of
RNA-dependent RNA polymerases, to evaluate the effectiveness of a compound to inhibit NS5B
activity.

MSMSYTWTGALITPCSPEEEKLPINPLSNSLLRYHNKVYCTTTKSASLRAKKVTFDRMQVLDSYYDSVLKDIKLA
ASKVTARLLTMEEACQLTPPHSARSKYGFGAKEVRSLSGRAVNHIKSVWKDLLEDSETPIPTTIMAKNEVFCVDP 
TKGGKKAARLIVYPDLGVRVCEKMALYDITQKLPQAVMGASYGFQYSPAQRVEFLLKAWAEKKDPMGFSYDTRCF 
DSTVTERDIRTEESIYRACSLPEEAHTAIHSLTERLYVGGPMFNSKGQTCGYRRCRASGVLTTSMGNTITCYVKA 
LAACKAAGIIAPTMLVCGDDLVVISESQGTEEDERNLRAFTEAMTRYSAPPGDPPRPEYDLELITSCSSNVSVAL 
GPQGRRRYYLTRDPTTPIARAAWETVRHSPVNSWLGNIIQYAPTIWARMVLMTHFFSILMAQDTLDQNLNFEMYG
AVYSVSPLDLPAIIERLHGLDAFSLHTYTPHELTRVASALRKLGAPPLRAWKSRARAVRASLISRGGRAAVCGRY
LFNWAVKTKLKLTPLPEARLLDLSSWFTVGAGGGDIYHSVSRARPR

Fig. 1A 



MSMSYTWTGALITPCGPEEEKLPINPLSNSLMRFHNKVYSTTSRSASLRAKKVTFDRVQVLDAHYDSVLQDVKRA 
ASKVSARLLTVEEACALTPPHSAKSRYGFGAKEVRSLSRRAVNHIRSVWEDLLEDQHTPIDTTIMAKNEVFCIDP 
TKGGKKPARLIVYPDLGVRVCEK^4ALYDIAQKLPKAIMGPSYGFQYSPAERVDFLLKAWGSKKDPMGFSYDTRCF 
DSTVTERDIRTEESIYQACSLPQEARTVIHSLTERLYVGGPMTNSKGQSCGYRRCRASGVFTTSMGNTMTCYIKA 
LAACKAAGIVDPVMLVCGDDLVVISESQGNEEDERNLRAFTEAMTRYSAPPGDLPRPEYDLELITSCSSNVSVAL 
DSRGRRRYFLTRDPTTPITRAAWETVRHSPVNSWLGNIIQYAPTIWVRMVIMTHFFSILLAQDTLNQNLNFEMYG
AVYSVNPLDLPAIIERLHGLEAFSLHTYSPHELSRVAATLRKLGAPPLRAWKSRARAVRASLIAQGARAAICGRY
LFNWAVKTKLKLTPLPEASRLDLSGWFTVGAGGGDIYHSVSHARPR

Fig. 1B



MSMSYTWTGALITPCSAEEEKLPISPLSNSLLRHHNLVYSTSSRSASQRQRKVTFDRLQVLDDHYKTALKEVKER 
ASRVKARMLTIEEACALVPPHSARSKFGYSAKDVRSLSSRAIDQIRSVWEDLLEDTTTPIPTTIMAKNEVFCVDP 
AKGGRKPARLIVYPDLGVRVCEKRALYDVIQKLSIETMGSAYGFQYSPQQRVERLLKMWTSKKTPLGFSYDTRCF 
DSTVTEQDIRVEEEIYQCCNLEPEARKVISSLTERLYCGGPMFNSKGAQCGYRRCRASGVLPTSFGNTITCYIKA 
TAAAKAAGLRNPDFLVCGDDLVVVAESDGVDEDRAALRAFTEAMTRYSAPPGDAPQPTYDLELITSCSSNVSVAR 
DDKGRRYYYLTRDATTPLARAAWETARHTPVNSWLGNIIMYAPTIWVRMVMMTHFFSILQSQEILDRPLDFEMYG 
ATYSVTPLDLPAIIERLHGLSAFTLHSYSPVELNRVAGTLRKLGCPPLRAWRHRARAVRAKLIAQGGKAKICGLY 
LFNWAVRTKTNLTPLPATGQLDLSSWFTVGVGGNDIYHSVSRARTR 

Fig. 1C 


MSMSYTWTGALVTPCAAEESKLPISPLSNSLLRHHNMVYATTTRSAVTRQKKVTFDRLQVVDSHYNEVLKEIKAR 
ASRVKARLLTTEEACDLTPPHSARSKFGYGAKDVRSHSRKAINHISSVWKDLLDDNNTPIPTTIMAKNEVFAVNP 
AKGGRKPARLIVYPDLGVRVCEKRALHDVIKKLPEAVMGAAYGFQYSPAQRVEFLLTAWKSKKTPMGFSYDTRCF 
DSTVTEKDIRVEEEVYQCCDLEPEARKVITALTDRLYVGGPMHNSKGDLCGYRRCRASGVYTTSFGNTLTCYLKA 
TAAIRAAGLRDCTMLVCGDDLVVIAESDGVEEDNRALRAFTEAMTRYSAPPGDAPQPAYDLELITSCSSNVSVAH 
DVTGKKVYYLTRDPETPLARAAWETVRHTPVNSWLGNIIVYAPTIWVRMILMTHFFSILQSQEALEKALDFDMYG 
VTYSITPLDLPAIIQRLHGLSAFTLHGYSPHELNRVAGALRKLGVPPLRAWRHRARAVRAKLIAQGGRAKICGIY 
LFNWAVKTKLKLTPLPAAAKLDLSGWFTVGAGGGDIYHSMSHARPR 

Fig. 1D 



MSMSYTWTGALITPCAAEEEKLPINPLSNSLIRHHNMVYSTTSRSASLRQKKVTFDRVQVFDQHYQEILKEIKLR 
ASKVQAKLLSVEEACDLTPSHSARSKYGYGAQDVRSHASKAVNHIRSVWEDLLEDSDTPIPTTIMAKNEVFCVDP 
SKGGRKPARLIVYPDLGVRVCEKMALYDVTQKLPQAVMGSAYGFQYSPTQRVEYLLKMWRSKKVPMGFSYDTRCF 
DSTVTERDIRTENDIYQSCQLDPVARRAVSSLTERLYVGGPMVNSKGQSCGYRRCRASGVLPTSMGNTITCYLKA 
QAACRAANIKDCDMLVCGDDLVVICESAGVQEDTESLRAFTDAMTRYSAPPGDAPQPTYDLELITSCSSNVSVAH 
DGNGKRYYYLTRDCTTPLARAAWETARHTPVNSWLGNIIMFAPTIWVRMVLMTHFFSILQSQEQLEKALDFDIYG 
VTYSVSPLDLPAIIQRLHGMAAFSLHGYSPVELNRVGACLRKLGVPPLRAWRHRARAVRAKLIAQGGKAAICGKY 
LFNWAVKTKLKLTPLVSASKLDLSGWFVAGYDGGDIYHSVSQARPR 

Fig. 1E 


ATGTCAATGTCGTATACATGGACAGGCGCCTTGATCACTCCTTGTAGTCCCGAAGAGGAGAAGTTACCGATTAAC 
CCCTTGAGCAACTCCCTGTTGCGATATCACAACAAGGTGTACTGTACCACAACAAAGAGCGCCTCACTAAGGGCT 
AAAAAGGTAACTTTTGATAGGATGCAAGTGCTCGACTCCTACTACGACTCAGTCTTAAAGGACATTAAGCTAGCG 
GCCTCCAAGGTCACCGCAAGGCTCCTCACCATGGAGGAGGCTTGCCAGTTAACCCCACCCCATTCTGCAAGATCT 
AAATATGGGTTTGGGGCTAAGGAGGTCCGCAGCTTGTCCGGGAGGGCCGTTAACCACATCAAGTCCGTGTGGAAG 
GACCTCCTGGAGGACTCAGAAACACCAATTCCCACAACCATTATGGCCAAAAATGAGGTGTTCTGCGTGGACCCC 
ACCAAGGGGGGCAAGAAAGCAGCTCGCCTTATCGTTTACCCTGACCTCGGCGTCAGGGTCTGCGAGAAGATGGCC 
CTTTATGACATTACACAAAAACTTCCTCAGGCGGTGATGGGGGCTTCTTATGGATTCCAGTATTCCCCCGCTCAG 
CGGGTAGAGTTTCTCTTGAAAGCATGGGCGGAAAAGAAGGACCCTATGGGTTTTTCGTATGATACCCGATGCTTT 
GACTCAACCGTCACTGAGAGAGACATCAGGACTGAGGAGTCCATATATCGGGCCTGCTCCTTGCCCGAGGAGGCC 
CACACTGCCATACACTCGCTAACTGAGAGACTTTACGTGGGAGGGCCTATGTTCAACAGCAAGGGCCAAACCTGC 
GGGTACAGGCGTTGCCGCGCCAGCGGGGTGCTCACCACTAGCATGGGGAACACCATCACATGCTACGTGAAAGCC 
TTAGCGGCTTGTAAAGCTGCAGGGATAATCGCGCCCACAATGCTGGTATGCGGCGATGACTTGGTTGTCATCTCA 
GAAAGCCAGGGGACCGAGGAGGACGAGCGGAACCTGAGAGCCTTCACGGAGGCTATGACCAGGTATTCTGCCCCT 
CCTGGTGACCCCCCCAGACCGGAGTATGATCTGGAGCTGATAACATCTTGCTCCTCAAATGTGTCTGTGGCGCTG 
GGCCCACAAGGCCGCCGCAGATACTACCTGACCAGAGACCCTACCACTCCAATCGCCCGGGCTGCCTGGGAAACA 
GTTAGACACTCCCCTGTCAATTCATGGCTGGGAAACATCATCCAGTACGCCCCGACCATATGGGCTCGCATGGTC 
CTGATGACACACTTCTTCTCCATTCTCATGGCTCAAGACACGCTGGACCAGAACCTCAACTTTGAGATGTACGGA 
GCGGTGTACTCCGTGAGTCCCTTGGACCTCCCAGCTATAATTGAAAGGTTACATGGGCTTGACGCTTTTTCTCTG 
CACACATACACTCCCCACGAACTGACACGGGTGGCTTCAGCCCTCAGAAAACTTGGGGCGCCACCCCTCAGAGCG 
TGGAAGAGCCGGGCACGTGCAGTCAGGGCGTCCCTCATCTCCCGTGGGGGGAGAGCGGCCGTCTGCGGTCGATAT 
CTCTTCAACTGGGCGGTGAAGACCAAGCTCAAACTCACTCCATTGCCGGAGGCGCGCCTCCTGGATTTATCCAGC 
TGGTTCACCGTCGGCGCCGGCGGGGGCGACATTTATCACAGCGTGTCGCGTGCCCGACCACGC 



Fig. 2A 



ATGTCAATGTCCTACACATGGACAGGCGCCTTGATCACACCATGTGGGCCCGAAGAGGAGAAGTTACCGATCAAC 
CCTCTGAGTAATTCGCTCATGCGGTTCCATAATAAGGTGTACTCCACAACCTCAAGGAGTGCCTCTCTGAGGGCA 
AAGAAGGTGACTTTTGACAGGGTGCAGGTGCTGGACGCACACTATGACTCAGTCTTGCAGGACGTTAAGCGGGCC 
GCCTCTAAGGTTAGTGCGAGGCTCCTCACGGTAGAGGAAGCCTGCGCGCTGACCCCGCCCCACTCCGCCAAATCG 
CGATACGGATTTGGGGCAAAAGAGGTGCGCAGCTTATCCAGGAGGGCCGTTAACCACATCCGGTCCGTGTGGGAG 
GACCTCCTGGAAGACCAACATACCCCAATTGACACAACTATCATGGCTAAAAATGAGGTGTTCTGCATTGATCCA 
ACTAAAGGTGGGAAAAAGCCAGCTCGCCTCATCGTATACCCCGACCTTGGGGTCAGGGTGTGCGAAAAGATGGCC 
CTCTATGACATCGCACAAAAGCTTCCCAAAGCGATAATGGGGCCATCCTATGGGTTCCAATACTCTCCCGCAGAA 
CGGGTCGATTTCCTCCTCAAAGCTTGGGGAAGTAAGAAGGACCCAATGGGGTTCTCGTATGACACCCGCTGCTTT 
GACTCAACCGTCACGGAGAGGGACATAAGAACAGAAGAATCCATATATCAGGCTTGTTCTCTGCCTCAAGAAGCC 
AGAACTGTCATACACTCGCTCACTGAGAGACTTTACGTAGGAGGGCCCATGACAAACAGCAAAGGGCAATCCTGC 
GGCTACAGGCGTTGCCGCGCAAGCGGTGTTTTCACCACCAGCATGGGGAATACCATGACATGTTACATCAAAGCC 
CTTGCAGCGTGTAAGGCTGCAGGGATCGTGGACCCTGTTATGTTGGTGTGTGGAGACGACCTGGTCGTCATCTCA 
GAGAGCCAAGGTAACGAGGAGGACGAGCGAAACCTGAGAGCTTTCACGGAGGCTATGACCAGGTATTCCGCCCCT 
CCCGGTGACCTTCCCAGACCGGAATATGACTTGGAGCTTATAACATCCTGCTCCTCAAACGTATCGGTAGCGCTG 
GACTCTCGGGGTCGCCGCCGGTACTTCCTAACCAGAGACCCTACCACTCCAATCACCCGAGCTGCTTGGGAAACA 
GTAAGACACTCCCCTGTCAATTCTTGGCTGGGCAACATCATCCAGTACGCCCCCACAATCTGGGTCCGGATGGTC 
ATAATGACTCACTTCTTCTCCATACTATTGGCCCAGGACACTCTGAACCAAAATCTCAATTTTGAGATGTACGGG 
GCAGTATACTCGGTCAATCCATTAGACCTACCGGCCATAATTGAAAGGCTACATGGGCTTGAAGCCTTTTCACTG 
CACACATACTCTCCCCACGAACTCTCACGGGTGGCAGCAACTCTCAGAAAACTTGGAGCGCCTCCCCTTAGAGCG 
TGGAAGAGTCGGGCGCGTGCCGTGAGAGCTTCACTCATCGCCCAAGGAGCGAGGGCGGCCATTTGTGGCCGCTAC 
CTCTTCAACTGGGCGGTGAAAACAAAGCTCAAACTCACTCCATTGCCCGAGGCGAGCCGCCTGGATTTATCCGGG 
TGGTTCACCGTGGGCGCCGGCGGGGGCGACATTTATCACAGCGTGTCGCATGCCCGACCCCGC 

Fig. 2B 


ATGTCAATGTCGTATACATGGACAGGCGCCTTGATCACACCATGTAGTGCTGAGGAGGAGAAACTGCCCATCAGC 
CCACTCAGCAATTCTTTGTTGAGACATCATAACCTAGTCTATTCAACGTCGTCGAGAAGCGCTTCCCAGCGTCAG 
AGGAAGGTTACCTTCGACAGACTGCAGGTGCTCGACGACCATTATAAGACTGCATTAAAGGAGGTGAAGGAGCGA 
GCGTCTAGGGTGAAGGCCCGCATGCTCACCATCGAGGAAGCGTGCGCGCTCGTCCCTCCTCACTCTGCCCGGTCG 
AAGTTCGGGTATAGTGCGAAGGACGTTCGCTCCTTGTCCAGCAGGGCCATTGACCAGATCCGCTCCGTCTGGGAG 
GACCTGCTGGAAGACACCACAACTCCAATTCCAACCACCATCATGGCGAAGAACGAGGTGTTTTGTGTGGACCCC 
GCTAAAGGGGGCCGCAAGCCCGCTCGCCTCATTGTGTACCCTGACCTGGGGGTGCGTGTCTGTGAGAAACGCGCC 
CTATATGACGTGATACAGAAGTTGTCAATTGAGACGATGGGTTCCGCTTATGGATTCCAATACTCGCCTCAACAG 
CGGGTCGAACGTCTACTGAAGATGTGGACCTCAAAGAAAACCCCCTTGGGGTTCTCATATGACACCCGCTGCTTT 
GACTCAACTGTCACTGAACAGGACATCAGGGTAGAAGAGGAGATATATCAATGCTGTAACCTTGAACCGGAGGCC 
AGGAAAGTGATCTCCTCCCTCACGGAGCGGCTTTACTGCGGGGGCCCTATGTTCAACAGCAAGGGGGCCCAGTGT 
GGTTATCGCCGTTGCCGTGCCAGTGGAGTTCTGCCTACCAGCTTTGGCAACACAATCACTTGTTACATCAAGGCC 
ACAGCGGCCGCGAAGGCCGCAGGCCTCCGGAACCCGGACTTTCTCGTCTGCGGAGATGATTTGGTCGTGGTGGCT 
GAAAGTGACGGCGTCGATGAGGATAGAGCAGCCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCTGCTCCA 
CCCGGAGATGCCCCACAGCCCACCTATGACCTTGAGCTCATTACATCTTGCTCCTCTAACGTCTCCGTAGCACGG 
GACGACAAGGGGAGGAGGTATTATTACCTCACCCGTGATGCCACTACTCCCCTAGCCCGCGCGGCTTGGGAAACA 
GCCCGTCACACTCCAGTCAACTCCTGGTTAGGTAACATCATCATGTACGCGCCTACTATCTGGGTGCGCATGGTA 
ATGATGACACACTTTTTCTCCATACTCCAATCCCAGGAGATACTTGATCGACCCCTTGACTTTGAAATGTACGGG 
GCCACTTACTCTGTCACTCCGCTGGATTTACCAGCAATCATTGAAAGACTCCATGGTCTAAGCGCATTTACGCTC 
CACAGTTACTCTCCAGTAGAGCTCAATAGGGTCGCGGGGACACTCAGGAAGCTTGGGTGCCCCCCCCTACGAGCT 
TGGAGACATCGGGCACGAGCAGTGCGCGCCAAGCTTATCGCCCAGGGAGGGAAGGCCAAAATATGTGGCCTTTAT 
CTCTTCAATTGGGCGGTACGCACCAAGACCAATCTCACTCCACTGCCAGCCACTGGCCAGTTGGACTTGTCCAGC 
TGGTTTACGGTTGGTGTCGGCGGGAACGACATTTATCACAGCGTGTCACGTGCCCGAACCCGC 



Fig. 2C 



ATGTCAATGTCGTATACATGGACAGGCGCCTTGGTAACACCTTGCGCGGCTGAGGAATCAAAGCTGCCAATTAGC 
CCCCTGAGCAATTCACTTTTGCGCCATCACAATATGGTGTATGCCACGACCACCCGTTCTGCTGTGACACGGCAG 
AAGAAGGTGACCTTCGACCGCCTGCAGGTGGTGGACAGTCACTACAATGAAGTGCTTAAGGAGATAAAGGCACGA 
GCATCCAGAGTGAAGGCACGCTTGCTTACCACAGAGGAAGCTTGCGACCTGACGCCCCCCCACTCAGCCAGATCA 
AAGTTCGGCTACGGGGCGAAGGATGTTCGGAGCCATTCCCGCAAGGCCATTAACCACATCAGCTCCGTGTGGAAG 
GACTTGCTGGACGACAACAATACCCCAATACCAACAACAATCATGGCCAAAAATGAGGTCTTCGCTGTGAACCCA 
GCGAAGGGAGGTCGGAAGCCTGCTCGCCTGATCGTGTATCCGGATCTCGGGGTCCGGGTTTGCGAGAAGAGAGCG 
CTTCACGACGTCATCAAAAAACTGCCTGAGGCCGTGATGGGAGCCGCTTATGGCTTCCAATACTCCCCAGCGCAG 
CGGGTGGAATTTCTTCTGACTGCTTGGAAGTCGAAGAAGACCCCAATGGGGTTCTCTTATGATACCCGCTGCTTT 
GACTCCACTGTAACCGAAAAGGACATCAGGGTCGAGGAAGAGGTCTATCAGTGTTGTGACCTGGAGCCCGAAGCC 
CGCAAAGTCATCACCGCCCTCACAGATAGACTCTATGTGGGCGGCCCTATGCACAACAGCAAGGGAGACCTTTGT 
GGGTATCGGAGATGTCGCGCAAGCGGCGTCTACACCACCAGCTTCGGGAACACGCTGACGTGCTATCTCAAAGCC 
ACGGCCGCCATCAGGGCGGCGGGGCTGAGAGACTGCACTATGTTGGTTTGCGGTGATGACTTAGTCGTCATCGCT 
GAGAGCGACGGCGTAGAGGAGGACAACCGAGCCCTCCGAGCCTTCACGGAGGCTATGACGAGATACTCGGCTCCC 
CCAGGTGACGCCCCGCAGCCAGCATATGACCTGGAACTAATAACATCATGTTCATCCAACGTCTCAGTCGCGCAC 
GACGTGACGGGTAAAAAGGTATATTACCTAACCCGAGACCCTGAAACTCCCTTGGCGCGAGCCGCATGGGAGACA 
GTCCGACACACTCCAGTCAATTCCTGGTTGGGAAACATCATAGTCTACGCTCCCACAATATGGGTGCGCATGATA 
TTGATGACCCACTTTTTCTCAATACTCCAGAGCCAGGAAGCCCTTGAGAAAGCACTCGACTTCGATATGTACGGA 
GTCACCTACTCTATCACTCCGCTGGATTTACCGGCAATCATTCAAAGACTCCATGGCTTAAGCGCGTTCACGCTG 
CACGGATACTCTCCACACGAACTCAACCGGGTGGCCGGAGCCCTCAGAAAACTTGGGGTACCCCCGCTGAGAGCG 
TGGAGACATCGGGCCCGAGCAGTCCGCGCTAAGCTTATCGCCCAGGGAGGTAGAGCCAAAATATGTGGCATATAC 
CTCTTTAACTGGGCGGTAAAAACCAAACTCAAACTCACTCCATTGCCTGCCGCTGCCAAACTCGATTTATCGGGT 
TGGTTTACGGTAGGCGCCGGCGGGGGAGACATTTATCACAGCATGTCTCATGCCCGACCCCGC 



Fig. 2D 



ATGTCAATGTCGTATACATGGACAGGCGCCTTGATAACACCATGTGCTGCGGAGGAGGAGAAGCTTCCAATAAAT 
CCTCTGAGCAACTCCCTCATAAGACACCATAACATGGTGTATTCCACCACATCACGCAGCGCCAGCCTCCGCCAG 
AAGAAGGTCACATTTGACAGAGTGCAAGTGTTCGACCAACATTACCAGGAAATACTAAAGGAGATTAAGCTTCGA 
GCGTCCAAGGTGCAGGCGAAGCTCTTATCCGTAGAGGAAGCCTGCGACCTCACACCATCGCACTCAGCCCGGTCC 
AAATATGGGTATGGTGCACAGGACGTTAGAAGCCATGCTAGCAAGGCCGTCAACCACATCCGCTCCGTGTGGGAG 
GACTTGCTAGAAGACTCTGATACTCCAATTCCCACAACCATCATGGCTAAGAATGAAGTCTTCTGCGTAGATCCG 
TCGAAGGGTGGACGCAAGCCGGCACGCTTAATAGTTTACCCAGACTTGGGCGTGCGGGTCTGCGAGAAGATGGCC 
CTATACGACGTCACGCAGAAGTTACCACAGGCCGTGATGGGTTCAGCATACGGATTCCAGTACTCCCCCACCCAG 
AGGGTTGAGTACCTGCTCAAAATGTGGCGGTCAAAGAAGGTGCCTATGGGCTTTTCTTACGACACCAGGTGTTTT 
GATTCAACCGTCACTGAGCGGGACATCCGGACTGAGAACGACATCTATCAGTCTTGCCAGCTGGATCCCGTAGCA 
AGGAGGGCAGTATCATCCCTAACGGAACGGCTCTACGTAGGCGGCCCCATGGTGAACTCCAAGGGACAGTCATGT 
GGCTACCGTAGATGCCGAGCCAGTGGGGTGCTGCCCACGAGCATGGGAAACACCATCACGTGCTATCTGAAGGCA 
CAGGCCGCCTGCAGGGCGGCCAACATCAAGGACTGTGACATGTTGGTGTGCGGAGATGACTTAGTGGTCATTTGT 
GAGAGTGCTGGCGTCCAGGAGGACACTGAGTCACTGCGAGCATTCACGGATGCTATGACCAGGTACTCAGCTCCC 
CCTGGAGACGCCCCGCAACCTACTTACGACCTTGAGCTCATAACATCATGCTCATCCAATGTCTCCGTCGCCCAC 
GATGGCAACGGGAAGAGATATTACTACCTCACACGTGACTGTACCACTCCACTTGCGCGGGCCGCCTGGGAGACA 
GCCCGCCACACTCCAGTCAACTCGTGGTTGGGCAACATCATTATGTTTGCCCCCACGATATGGGTGCGTATGGTT 
CTGATGACCCATTTTTTCTCCATCCTCCAGTCACAAGAGCAATTGGAGAAAGCACTCGACTTTGACATCTATGGA 
GTGACCTATTCCGTCTCTCCACTTGATCTCCCAGCAATCATTCAACGACTCCATGGCATGGCAGCATTTTCACTC 
CACGGATACTCTCCAGTTGAGCTCAATAGGGTAGGGGCTTGCCTCAGGAAACTTGGGGTGCCTCCCTTGCGAGCC 
TGGAGACATCGAGCCAGAGCTGTCAGAGCCAAACTCATTGCCCAAGGGGGGAAAGCGGCCATATGCGGTAAGTAC 
CTCTTTAACTGGGCAGTGAAGACCAAACTAAAACTCACTCCATTGGTCTCCGCGAGCAAGCTTGACTTATCAGGC 
TGGTTCGTGGCCGGCTACGACGGGGGGGACATTTATCACAGCGTGTCCCAGGCTCGACCCCGT 



Fig. 2E 